High Performance Computational Analysis of Large-scale Proteome Data Sets to Assess Incremental Contribution to Coverage of the Human Genome

Abstract

Computational analysis of shotgun proteomics data can now be performed in a completely automated and statistically rigorous way, as exemplified by the freely available MaxQuant environment. The sophisticated algorithms involved and the sheer amount of data translate into very high computational demands. Here we describe parallelization and memory optimization of the MaxQuant software with the aim of executing it on a large computer cluster. We analyze and mitigate bottlenecks in overall performance and find that the most time-consuming algorithms are those detecting peptide features in the MS1 data, together with the fragment spectrum search. These tasks scale with the number of raw files and can readily be distributed over many CPUs as long as memory access is properly managed. We compare the performance of a parallelized version of MaxQuant running on a standard desktop computer, a desktop computer optimized for I/O performance (a "gaming computer"), and a cluster environment. The modified gaming computer and the cluster vastly outperformed the standard desktop computer when analyzing more than 1000 raw files. We apply our high-performance platform to investigate incremental coverage of the human proteome by high-resolution MS data originating from in-depth cell line and cancer tissue proteome measurements.